In this paper, we study the problem of estimating uniformly well the mean values of several distributions given a finite budget of samples. If the variances of the distributions were known, one could design an optimal sampling strategy by collecting a number of independent samples per distribution that is proportional to its variance. However, in the more realistic case where the distributions are not known in advance, one needs to design adaptive sampling strategies in order to select which distribution to sample from according to the previously observed samples. We describe two strategies based on pulling the distributions a number of times that is proportional to a high-probability upper-confidence-bound on their variance (built from previously observed samples) and report a finite-sample performance analysis on the excess estimation error compared to the optimal allocation. We show that the performance of these allocation strategies depends not only on the variances but also on the full shape of the distributions.
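As a rough illustration of the two ideas above, the following Python sketch contrasts a static allocation proportional to known variances with an adaptive rule driven by an upper bound on the empirical variance. It is not one of the strategies analyzed in the paper: the function names, the `bonus` and `init` parameters, and in particular the form of the confidence width are assumptions made for illustration only, and the variances passed to the static rule are assumed to be strictly positive.

```python
import math
import random


def proportional_allocation(variances, budget):
    """Split `budget` samples across distributions in proportion to known variances."""
    total = sum(variances)
    raw = [budget * v / total for v in variances]
    counts = [math.floor(r) for r in raw]
    # Hand the leftover samples to the largest fractional parts.
    leftover = budget - sum(counts)
    order = sorted(range(len(raw)), key=lambda k: raw[k] - counts[k], reverse=True)
    for k in order[:leftover]:
        counts[k] += 1
    return counts


def adaptive_allocation(samplers, budget, bonus=1.0, init=2):
    """Repeatedly sample the distribution with the largest (variance bound) / (count)."""
    # Draw a few initial samples so that every empirical variance is defined.
    samples = [[draw() for _ in range(init)] for draw in samplers]
    for _ in range(budget - init * len(samplers)):
        scores = []
        for xs in samples:
            n = len(xs)
            mean = sum(xs) / n
            var = sum((x - mean) ** 2 for x in xs) / (n - 1)
            # Hypothetical confidence width; NOT the bound used in the paper.
            upper = var + bonus * math.sqrt(1.0 / n)
            scores.append(upper / n)
        k = max(range(len(scores)), key=scores.__getitem__)
        samples[k].append(samplers[k]())
    counts = [len(xs) for xs in samples]
    means = [sum(xs) / len(xs) for xs in samples]
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    arms = [lambda: rng.gauss(0.0, 1.0), lambda: rng.gauss(0.0, 3.0)]
    print(proportional_allocation([1.0, 9.0], 200))
    print(adaptive_allocation(arms, 200))
```

With known variances the static rule gives each distribution a share of the budget equal to its share of the total variance. The adaptive rule has no access to the variances, so it keeps sampling whichever distribution currently looks most under-sampled relative to its estimated variance bound.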